Practical characterization of quantum devices without tomography 
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Quantum tomography is the main method used to assess the quality of quantum information processing de- 
vices. However, the number of experimental settings and the data processing time required to extract complete 
information about a device via tomography grows exponentially with the device size. Part of the problem is 
that tomography generates much more information than is usually sought. Taking a more targeted approach, we 
develop schemes that enable (i) estimating the fidelity of an experiment to a theoretical ideal description, (ii) 
learning which description within a reduced subset best matches the experimental data. Both these approaches 
yield a significant reduction in resources compared to tomography. In particular, we demonstrate that fidelity 
can be estimated from a number of simple experiments that is independent of the system size, removing an 
important roadblock for the experimental study of larger quantum information processing units. 



The building blocks for quantum computers have been 
demonstrated in a number of different physical systems JTJ-fS) . 
In order to quantify how closely these demonstrations come 
to the ideal operations, the experiments are fully character- 
ized via either quantum state tomography [7 1 or quantum pro- 
cess tomography [8|. An important advantage of these meth- 
ods is that they require only simple local measurements. The 
main drawbacks however are that tomography fundamentally 
requires both experimental and data post-processing resources 
that increase exponentially with the number of particles n J9] . 

It is important to realize that the exponential cost of tomog- 
raphy is not a problem restricted to a large number of qubits. 
For example, recent ion trap experiments characterizing an 8 
qubit state required 10 hours of measurements, despite col- 
lecting only 100 samples per observable [3|. Surprisingly, the 
post-processing of the data obtained from these experiments 
took approximately a week ifTUll . Under similar time scales, 
the characterization of a 16 qubit state would take years of 
measurements, and over a century of data post-processing. 
This is clearly a major obstacle in the demonstration of work- 
ing quantum computers, even at sizes moderately larger than 
what has been demonstrated to date. 

Moreover, one of the key assumptions for the fault- 
tolerance theorems of quantum computation is that the noise 
on elementary components does not scale badly with the 
system size ifTTI . Therefore, despite the fact that univer- 
sal quantum computation can be realized with one- and two- 
qubit elementary operations, it is not sufficient to characterize 
small gates — larger systems may have significant noise con- 
tributions from correlated sources as seen in recent experi- 
ments [6 1. The characterization of multi-qubit states and oper- 
ations provides crucial information for the verification of these 
assumptions, and therefore the development of large quantum 
information processors. 

Part of the problem with the usual approach is that to- 
mography often provides more information than what is truly 
sought. Given an experiment that prepares a quantum state 
represented by a density operator a, one usually extracts a 
complete description for a via quantum tomography, and then 



compares this description to a theoretical state p by comput- 
ing the fidelity F(p, a) — a single number, commonly used as 
similarity measure. As this example illustrates, we often have 
an idea of what has been realized in the laboratory, so we are 
interested in asking for much less information — e.g., we only 
want to know the distance to some particular theoretical tar- 
get or to learn the identity of the state or operation within a 
restricted set of possibilities. 

In this Letter, we develop targeted approaches to directly 
extract the information of interest. Our main results, summa- 
rized at Table[lj show that it is possible to efficiently character- 
ize a large class of states and operations — including some that 
are universal resources for quantum computation — without re- 
sorting to tomography and using only local measurements and 
the preparation of product states. Our methods apply to dis- 
crete variable systems such as qubits, as well as continuous 
variable systems such as oscillators. We consider two types of 
characterization: certification and learning. 

Learning consists of identifying the theoretical description 
from a restricted set of possibilities that best matches the ex- 
perimental data. There exists many classes of "variational" 
states in physics that can be specified with a small number of 
parameters. We provide examples where these parameters can 
be extracted directly from experiments, circumventing tomog- 
raphy and hence drastically reducing the complexity. 

Certification consists of estimating the fidelity between an 
experimental device and some theoretical target. We demon- 
strate that certification always requires drastically less re- 
sources than full tomography — in some important cases, it is 
an exponential reduction in resources. Even in the worst case, 
our scheme offers four significant advantages for the charac- 
terization of quantum states (equivalent statements hold for 
quantum operations): (1) Its computational cost is bounded 
by rt 2 4™, compared to 4 3 " required for the simplest tomog- 
raphy procedure based on pseudo-inverses. (2) The num- 
ber of distinct experimental settings it requires is constant — 
independent of the system size and depending only on the de- 
sired accuracy of the estimate — compared to the 4™ distinct 
experiments needed by tomography, or the 0(n2 n ) settings 
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Stabilizer 
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General MPS 


O(n) 
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Clifford 


0(1) 


0(1) 


poly(ra) 
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General unitary 


0(n 2 2 4 ") 


e>(2 2 ") 


0(2 12 ™) 


Evolution 


Local Hamiltonian 






O(n) 




Local Lindbladian 






C(n) 



TABLE I. Complexity of the characterization of various states and 
processes. Entries in red are efficient, i.e. require resources that grow 
at most polynomially with the number of qubits n. The Sampling 
column gives the complexity of the classical processing required to 
sample from the relevance distribution, c.f. CI. The Fluctuations 
column gives the number of measurements required to suppress sta- 
tistical fluctuations when evaluating the fidelity, c.f. C2. The Learn- 
ing column gives the total number of measurements (including rep- 
etitions of the same measurement setting) required to learn the state 
within a restricted set; the classical processing is always a polyno- 
mial of that number. When both fidelity estimate and learning are 
efficient, it is not necessary to assume that the state belongs to a re- 
stricted set as fidelity testifies of that assumption. Stabilizer states, 
Clifford gates, Local Hamiltonians and Lindbladians are discussed 
in the main text. The W state has often been used as an experimental 
benchmark, e.g. (3). The \t n ) state plays a key role in linear optics 
quantum computation 1 14 1. Matrix product states (MPS) accurately 
describe ground states of ID quantum systems 1151 . An important 
example of a process with MPS Choi matrix is the approximate quan- 
tum Fourier transform [161, key component of Shor's factoring algo- 
rithm. Question marks indicate open problems, but they can be no 
worst than the general states and operations. 

required by compressed sensing techniques fl2l . (3) The 
total number of measurements (counting repeated measure- 
ments used to statistically estimate expectation values) of our 
scheme is bounded by 0(2"), which is at least a quadratic im- 
provement over what is required by full tomography. (4) The 
data post-processing of our scheme is trivial, while the correct 
method of processing tomography data is a matter of current 
debates and different methods produce significantly different 
results HOI . 

The rest of this Letter is structured as follows. In the next 
three sections, we describe the state certification scheme for 
qubits, show how it extends to continuous variable systems, 
and the certification of quantum processes. Then, we present 
concrete examples drawn from Table [I] 

Monte Carlo state certification — To estimate the fidelity to 
some theoretical pure state p, we use the fidelity 

i i 

where pi = tr pPi, o~i = tr<7pj, d is the dimension of the 
Hilbert space, and P, is some orthonormal Hermitian opera- 
tor basis satisfying tr PjP 3 = dSij. For a system composed 
of n qubits, the Pj could be the 4™ Pauli operators obtained 
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FIG. 1. (a) Wigner function representation of a harmonic oscillator 
in the superposition = a) + |— a) for a = 3, (b) 10 3 samples of 
points in the complex plane drawn according to the relevance density 
of \tp), (c) Wigner function representation of a harmonic oscillator in 
the incoherent mixture of \a) and |— a), corresponding to the prepa- 
ration of <r, (d) absolute error in successive estimates of the fidelity 
F(p, a) for 5 different runs with 10 3 samples each. 

by taking tensor products of the Pauli matrices and the iden- 

2 

tity. Defining the relevance distribution Pr(z) = ^f-, we can 
rewrite the fidelity as F(a, p) — J^i P 1 "^)^' where the sum 
is taken over only the i with pi ^ 0. This expression leads 
to an experimental procedure to estimate the fidelity based on 
Monte Carlo methods as follows: one generates N random in- 
dices ii, 12, ■ ■ . , In following the relevance distribution Pr(i) 
and estimates <7; fc = (Pi k )^, the experimental expectation 
value of the observable Pi k . With high probability, the fidelity 
is close to J2k=i ^7 w ^ t ' 1 an uncertainty that decreases as 
-j=. The total number of distinct experimental settings is at 
most AT, independent of the system size. 

There are two important caveats to this technique: 

CI Generating an index i according to the relevance dis- 
tribution Pr(i) can in general require an exponential 
amount of computational resources. 

C2 Each <7j fc is estimated within some finite accuracy. To 
estimate the fidelity with accuracy e therefore requires 
repeating the measurement of P Jfc roughly {epi k )~ 2 
times, which in the worst case grows exponentially with 
the number of qubits. 

These are important limitations, and as a consequence our 
method will not scale polynomially for all quantum states and 
operations, but nevertheless always does significantly better 
than tomography. In addition, there are important classes of 
states and operations which avoid these two problems (see Ta- 
ble [I] and the Supplemental Material for complete details). 
Continuous variables systems — For infinite dimensional sys- 
tems, such as a harmonic oscillator or a single optical mode 
in a cavity, it is more convenient to describe a state p by its 
Wigner functions Wp{a) ifTTI (other indicator functions could 
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also be used). Equation ([TJ becomes 

W ) = I/*«p(«)^ (2) 
Jc Wp{a) 

where the relevance density p(a) — W?(a) is defined as the 
square of the Wigner function of the theoretical state, whose 
purity guarantees once again that p(a) is well defined as a 
probability density. The Wigner function of the experimen- 
tal state a can be measured by interactions with an atom and 
measurements of the atom's state ITSl . Points in the complex 
plane can be selected according to p(a) using simple meth- 
ods such as rejection sampling. As an example, we simulated 
this proposed method to estimate the fidelity between a quan- 
tum superposition of two harmonic oscillator states — a "cat" 
state 4^ (| a) + \ — a)) — and the probabilistic mixture of those 
two classical states. For the given choice ofparameters, this 
fidelity is 1/2(1 + e~ 2a ) w 0.5, and Fig. Ill clearly demon- 
strates a close agreement between the Monte Carlo estimate 
and the exact theoretical value, as the absolute error decreases 
like the square-root of the number of samples of the Wigner 
function. As expected, the error in the fidelity estimate does 
not depend on the state itself (e.g. average number of pho- 
tons, amplitude, etc.) but only on the number of samples. We 
emphasize once again that no estimate of the Wigner function 
of the experimental state is ever made, so there is no need for 
maximum-likelihood fits to the data, or Radon transforms. 

Monte Carlo process certification — The Choi- 
Jamiolkowski isomorphism [19| associates to every quantum 
operation £ on a tf-dimensional space a density operator p£ 
on a <i 2 -dimensional space via p £ = (id(8>£) (\4')( < P\)) where 
= 53i=i I*) ® I*) ar, d id is the identity operation. 
As with state certification, our goal is to compare a target 
unitary U to its experimental realization U. A good figure 
of merit in that case is the average output fidelity F(U,U), 
defined as the fidelity between the output states produced by 
U and U, averaged uniformly over all pure input states. It can 
be shown that F(U,U) = rfF( ^" )+1 11201 . reducing the 

problem of comparing two processes U and U to the problem 
of comparing two states p u and p^. This problem is solved 
by the Monte Carlo state certification presented above. 

While this derivation makes use of the maximally entangled 
state \(/>), the experimental realization of the protocol requires 
only the preparation of product states. A direct implemen- 
tation of the quantum Monte Carlo state certification would 
prepare a maximally entangled state \<p), apply U to half of 
the system, and then measure random Pauli operators on all 
qubits. A more practical approach consists of preparing the 
complex conjugate of random product of eigenstates of local 
Pauli operators (corresponding to the resulting state after half 
of the entangled state is measured destructively), applying the 
transformation U to the system, and finally measuring a ran- 
dom Pauli operator on each qubit. This simplification, based 
on the identity {\p)(p\ ® id)\<f>) = \p) ® | /•*)*, generates the 
same statistics as the direct scheme ETI . 



Computation via teleportation — Some of the most promis- 
ing approaches to universal and scalable quantum com- 
putation are teleportation-based quantum computation ll22ll 
and measurement-based quantum computation (231 . Both 
these approaches rely heavily on the preparation of stabilizer 
states 1 24 1 and the application of quantum operations known 
as the Clifford group [22|, which map stabilizer states to sta- 
bilizer states. Stabilizer states are also important for quantum 
computation in general because of their close relationship to 
a large class of quantum error correction codes known as sta- 
bilizer codes. Many of the experimental demonstrations of 
state preparation to date have been of stabilizer states, such as 
states encoded into stabilizer codes [2J, cluster states |4|, and 
the GHZ state |00 ■ • • 0) + |11 ■ ■ • 1) 00. 

We first describe how to certify these states and operations. 
Stabilizer states are defined to be +1 eigenstates of some set 
of commuting Pauli operators Sj that generate the stabilizer 
group, i.e. Sj\ip) = \if>) for all j — 1, . ..n. It follows that 
Pr(i) = 1/d if either of ±Pj is in the stabilizer group and 
otherwise. Sampling from Pr(i) thus amounts to generating 
an index i uniformly between 1 and d, avoiding the problem 
associated with caveat CI. For the same reasons, pf = 1 for 
all i with Pr(i) ^ 0, so that the uncertainty in the estimation 
of er.j is not amplified, avoiding the problem associated with 
caveat C2. It also follows that the fidelity F(a, p] to a stabi- 
lizer state p can be estimated with error e using N — 0{\) 
experiments involving only local projective measurements, in- 
dependently of the system size and without any prior knowl- 
edge of the experimental state a. Since this result relies only 
on local measurements, it can immediately be generalized to 
states which are locally equivalent to stabilizer states. 

This result carries over directly to the certification of Clif- 
ford operation because their Choi-Jamiolkowski density op- 
erators are stabilizer states. In the case of Clifford transfor- 
mations similar results can be obtained using "twirling" ex- 
periments 1 25 1 or by the selective measurement of matrix ele- 
ments of the Choi matrix [21|, although the Monte Carlo ap- 
proach described here generalizes to other cases. 

While operations in the Clifford group are not sufficient to 
perform universal computation [22], single qubit rotations can 
be used to reach universality, and these can be certified effi- 
ciently thanks to local equivalence of either operations (if the 
rotation is applied directly) or state preparation (if the rotation 
is applied via "magic state" teleportation Il22ll26l ). 

Stabilizer states can also be learned efficiently, as pointed 
out by Aaronson and Gottesman ll27ll . although the known 
method for efficient stabilizer learning requires entangling 
measurements. Aside from the direct generalizatin of the sta- 
bilizer approach, Clifford group operations can be learned 
efficiently [28 1 if one has access to Bell measurements and 
the inverse of the operation being learned. The problem of 
performing these tasks efficiently with strictly local measure- 
ments and without the need for the inverse remains open. 

Local Hamiltonians and Lindbladians — Models of univer- 
sal quantum computation exist where the idea of discrete gates 
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is not a natural fit. Instead, the system evolves in a continu- 
ous way, governed by some dynamical equation = Qp. 
The most direct way to determine how accurately these dy- 
namics can be realize is to estimate the time evolution gen- 
erator Q of the system, and explicitly check how it compares 
against the ideal target generator. Important examples include 
local Hamiltonians and Lindbladians that are universal for adi- 
abatic quantum computation ll29l and dissipation-driven quan- 
tum computation [30] respectively. 

In what follows we demonstrate how to learn such local Q 
using only (i) the preparation of initial product states, (ii) the 
simultaneous measurement of a constant number of single- 
qubit operator, (Hi) a number of experimental settings that 
grows linearly with the system size, (iv) and classical post- 
processing of complexity n 3 (inverting an cn x cn matrix for 
some constant c); improving on OTI . 

Consider the case of coherent evolution generated by some 
Hamiltonian H. For a short time t, the expectation value of 
any observable A evolves as 

- trip = it([H,A})p + 0(\\H\\ 2 t 2 ). (3) 

By experimentally measuring this expectation value, we ob- 
tain one linear constraint on the Hamiltonian. Varying over 
different observables A\ and initial states pj, we obtain more 
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linear constraints that we can write as Wi- 
tr AiPj — it([H, Ai])p. where we have dropped the higher 
order terms 0(||i7|| 2 t 2 ). Writing H in an operator basis H = 
2~2i^iPi, we obtain the linear equation Wij = 2^2iTij,ihi 
where Ty j = it tr Pj[Pi, Ai]. The Hamiltonian can be 
learned by inverting this linear equation OTI . 

There are in general a number important caveats to this ap- 
proach, although all of these disappear when the Hamiltonian 
is local, which is nonetheless sufficient to achieve universal 
quantum computation ||29l [30l . The Lieb-Robinson bound 
ll32l shows that only the Hamiltonian Hr in a region R a 
distance d « vt of the local observable A contributes to its 



evolution, i.e., e lHt Ae 



iHt 



iH n t 



Rt Ae 



-iH R t 



(for details of 



the proof see the Supplemental Material). This fact solves all 
the problems associated to the proposal of ET1 : 

1) The error 0(||i?|| 2 i 2 ) appearing in Eq. ( pO] ) becomes 
0(\\H R \\ 2 t 2 ) = 0(|| A|| 2 i 4 ), independent of the system size. 
Thus, it is not necessary to decrease the evolution time t as the 
system size increases to achieve a given accuracy. 

2) Because the Hamiltonian is local, the number of non- 
zero terms hi is proportional to the number of particles in any 
finite dimension. Thus, in the linear equation for Wij, the 
range of the index I increases only linearly with the number 
of particles, as opposed to the exponential growth for generic 
Hamiltonians. 

3) Because the dynamics is local, Ty; = 2y/ j when pj 
and pj> differ only outside a region of radius k away from 
the local observable Ai. In addition, the T become lin- 
early dependent — and thus redundant — when the input states 
are linearly dependent. For each observable Ai, we only 
need to vary the initial state locally, so the total number of 



observable-state pairs (ij) grows linearly with the number of 
particles. Thus, learning the Hamiltonian — or equivalently 
the hi — amounts to inverting the linear-size linear equation 

W ij =EiT ij ,ihi- 

4) Product input states form a complete operator basis, so 

they are sufficient to gain all information about the Hamilto- 
nian. Thus tr Aipj can be easily computed since Aj is local 
and pj is a product state. The quantity tr pj [Pi , Ai) can also be 
evaluated efficiently because the commutator of two fc-local 
operators is at most 2/c-local, and pj is a product state. 
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Statistical bound for Monte-Carlo estimation of the fidelity 

We present rigorous bounds for the error of the Monte Carlo 
fidelity estimate in the case of an n qubit system. The result 
(c.f. Eq. Q) can be adapted to the case of continuous vari- 
able systems through the minor modification of replacing the 
expectation value pi h by the value ^Wp(ai k ) of the Wigner 
function of state p at point a ik . 

Theorem 1. Let p = *jPi be the decomposition of the 
pure state p over the orthogonal Hermitian operator basis 
{Pi} where tr Pi Pj = dSij and the operator norm \\Pi\\ < 1. 
One can obtain an estimate F of the fidelity F(p, a) between 
p and a with error e = e± + 62 such that 



Pr(|F-F| > e) < 



2exp 




(4) 



where 



• I = {i\ . ..•jjv 1 } are N\ indices sampled from Pr(i), 
corresponding to observables P lk to be measured ex- 
perimentally on a 

• iVJ, is the number of experimental samples taken to 
estimate &i k = tr Pi k a 

• ei is the error associated to the Monte Carlo estimate 

• £2 is the error associated to the experimental estimation 
of the {<Ji} ie i 



Proof. The fidelity F(p, a) can be rewritten as 

F(p,a) = Y! 



d pi 



(5) 
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where prime indicates that the summation runs only over non- 
zero values of p,-. Since tr/5 2 = 1 by assumption, Pr(i) = 
pf/disa normalized probability distribution. We can thus in- 
terpret the fidelity as the expectation value of a random vari- 
able X which takes value Oj/pj with probability Pr(i). Its 
variance is bounded by a constant, as 



Var(X) = £ 



d 



F 2 < tro- 2 - F 2 < 1, 



and thus, using Chebyshev's inequality, we obtain 

Pr(|F-F x | > ei ) < ' 



(6) 



(7) 



where Fi = J2iei a %l P% I s tr, e estimate of the fidelity by sam- 
pling N\ realizations of X, i.e., by drawing I = {i\ . . . } 
indexes from the probability distribution Pr(i) and estimating 
E(X) by the realization of X = TVf 1 J2 ie i X i where a11 x i 
are independent and distributed as X. Thus, the number of 
measurements settings does not depend on the dimension of 
the system and scales as 0(l/e 2 ). 

The expectation value <7j of each observables with respect 
to the experimental state a can only be estimated up to finite 
precision. For each if. e /, the observable Pj fc is measured on 

the experimental state and yields a number yf^ whose abso- 
lute value is bounded by the operator norm of the observables. 
This measurement is repeated jW, times and the approximate 

realization of X k is a ik /p ik = (ft./A^ 1 ) TZ=iVt k ] - 
This estimation proceadure is then repeated for each of the N\ 
observables. Hoeffding's bound l34l states that, if the inde- 
pendent real random variables Yi are such that a, < Yi < b i7 
then for S = Y x + Y 2 + ■ ■ ■ + Y n , 



Pt(\S-(S)\ >t)< 2exp 



2t 2 



(8) 



In our case, for all k, we have —1/ \pi k \ < yf^ / p% k < 1/ \pi k \ 

for all iVJ, experimental measurement performed to esti- 
mate <Ti k and we can apply the Hoeffding's inequality to all 
N = Y2k=i experimental samples to bound the distance 
between the sum F of a ik / p ik by 



Pr(|F-Fi| > e 2 ) < 2exp 



e 2 N 2 



1 



k=i Pl k N : 



[k] 



(9) 

Finally to reach eq. |4]), one applies the union bound to \F — 

F\ < \F-Fil + " □ 

As can be seen in the last term of Eq. Q, observable Pi k 
must be sampled ^> pf k times to obtain an accurate es- 
timate of its expectation value. While this can be large in 
general, there are many important cases where the pi are only 
polynomially small, leading to a polynomial ■ Two exam- 
ples of cases of interest beyond the stabilizer states presented 



in main text are the W state [3] and the \t n ) state used in lin- 
ear optics for heralded teleportation with high success prob- 
ability [14]. Both are MPS with bond dimension 2 and are 
uniform superpositions of a linear number of computational- 
basis states. For both states, the expectation value of a Pauli 
operator P is given by 



\P\i>) = a(n)J2(i\P\j) 



(10) 



1,3 



where a(n) is l/n for the W state and l/(n + 1) for \t n ), 
and the sum runs over computational states that appear in the 
decomposition of the state. For all there exists a Pauli 
operator &ij such that \j) = Since the Pauli operators 

form a group, -Pay is another Pauli operator and all terms 
appearing in the sums are ±1. Thus, the smallest non-zero 
Pauli expectation scales as 1 /n, and the number of samples 
required to estimate <7j /pi to constant accuracy scales as n 2 in 
the worst case. 

More generally, we can improve the error bound Eq. Q by 
truncating the relevance distribution. Define the set of neg- 
ligible expectation values as S = {pi such that \pi\ < d~ a } 
where a is a positive number to be determined. We split the 
fidelity into a significant and a negligible contribution 



P,<t:S 



PiOj 

d 



Pi^i 
d 



(11) 



and bound the negligible contribution using 



£ 

Pies 



d 



< 



p t es 



1^1 
d 



max \Pi 

zes ^ 



< d -(<*+D y \ a A (12) 



£ 

Pies 



The sum of a subset of |<7j| is bounded by the sum over all 
\<ji\. To bound ^ \<?i\, we can use the constraint on the pu- 
rity of the state ^ of = d tr a 2 < d. The sum of absolute 
values is maximal when all absolute values are equal, which 
follows from standard Lagrange multiplier techniques. The 
purity constraint finally leads to 



£ \<Ti\ < dVd tra 2 < d 3/2 . (13) 

i 

Inserting this inequality that into eq. < fT2] ) yields 



£ 

Pies 



PiOt 

d 



(14) 



Hence, the sum over negligible pi vanishes exponentially for 
a = (1 + e)/2, i.e., when we drop all expectation values 
smaller than d ~ in absolute value, for any constant e > 0. 

We thus modify the sampling method in the following 
way. For each observable Pi picked from sampling the rel- 
evance distribution, compute the corresponding expectation 
value pi — tr pPi. When p 2 < d~ 1 ~ e , reject this entry, oth- 
erwise you proceed as before. It is important to verify that this 
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modification does not slow down the procedure, i.e. that we 
are not constantly rejecting samples. To see this, notice that 
the probability of choosing an element from the negligible set 
is bounded by 



2-e 



< dr 



(15) 



Since we reject all negligible pi, the maximum number of 
repeated measurements needed for a given experimental set- 
ting scales in the worst case as d 1+t . In particular, for qubits, 
the maximum number of measurements is 2 n ( 1+e \ More- 
over, since the number of measurement settings does not scale 
with the size of the system, the total number of measurements 
scales as 0(2™( 1+e )) which is at least a quadratic improve- 
ment over the number of measurements needed to perform 
brute-force tomography on a generic state of n qubits. 



Extension to continuous variables systems 

The Monte Carlo method proposed here can be adapted 
to continuous variable systems, such as a single electromag- 
netic field mode in a cavity ll35U37l . by modifying how 
the state is parameterized and how the sampling is per- 
formed. The main reason for this is the obvious difficulty 
of measuring observables in a discrete infinite dimensional 
operator basis. This problem can be avoided by consider- 
ing phase-space quasiprobability distribution descriptions of 
quantum states. If we consider the dual phase-space distribu- 
tions fp(a) and gp (a) which correspond respectively to the 
quantum state p and an observable P, [38], then trpPi = 
~ J c d 2 a fp(a)gp (a). It follows that the fidelity between a 
pure state p and an arbitrary state a is given by F(p, a) = 
tr p u = i f c d 2 a fp(a) ga(a), which can be re-written as 



where the integration ex- 

/!(«) is 



F{p,a) = ^/ C A P («) 
eludes regions with f p {a) = and where p(a 
the relevance density function. Sampling the relevance density 
can be done by standard methods, such as rejection sampling. 

The choice of phase space distributions is important, as it 
must be possible to interpret /J (a) as probability distribu- 
tions, and it must be possible to estimate ga(a) at some ar- 
bitrary aeC easily from experimental data. One choice that 
fulfills both these requirements for all states is the Wigner 
function ifTTl [38). The Wigner function is self-dual and 
bounded in magnitude by 2, and its value at particular a can 
be estimated by using simple experiments where the contin- 
uous variable system, such as an electromagnetic field mode, 
interacts with an atom lfl8l l37l l39l l40l . 

The same truncation technique used to evaluate the perfor- 
mance of this algorithm for qubits can be used for continuous 
variable systems. Amplification of experimental uncertainty 
can once again by reduced by placing a cut-off in the rele- 
vance density function. If we disregard regions in phase space 
where the absolute value of the relevance density is below c, 



then the error E in the fidelity is bounded by 
1 



E = 




d z a Wp{a)W a {a) 



<- ] IJ i <PaW?(a) 



(16) 
(17) 



where / is the region in phase space where \ Wp\ < c. 



Sampling from the relevance distribution 

Sampling from the relevance distribution Pr(i) is not trivial 
because the dimension of the operator space on n particles is 
exponentially large in n. Therefore, computing all p t = tr pp 
for all observables p is unefficient. Furthermore, computing 
a given p L can be a challenging task in itself. However, by 
choosing operators p = pf 1 ® . . . (g> that are tensor prod- 
ucts of single-particle operators — such as the Pauli operators 
for qubits — sampling can be simplified by recursively picking 
the observables for each particle as we now demonstrate. 



Sampling using conditional probabilities 

Consider for concreteness a system composed of n qubits, 
and an operator basis p all consisting of tensor product of 
single qubit operators, e.g. Pauli operators. The Hilbert space 
dimension is d = 2 n . For an observable p = <2)™ =1 K , 
denote the relevance distribution Pr(i) = ^ » . Using the 
probability chain rule, this probability can be expressed as a 
product of conditional probabilities 



Qix,...,in 



n 1*1. 



(18) 



k=l 



where the conditional probability ?i fc |i 1 ,...,i A ._ 1 of drawing the 

observable pf^ on particle k knowing which observables have 
been picked on the previous particles is 



ft!,... 



E 



i fti. 



(19) 



Using equation (TT8J, sampling from the probability distribu- 
tion reduces to sequentially picking an observable ac- 
cording to the conditional probability distribution ( fT9| ) which 
can be written, up to a normalization factor, as 



oc 



Pev 



= tr 




7 



where the trace of two copies accounts for the square in the 
definition of Pr(i) = = MMgWil , T he sum over 

all duplicated observables P® 2 can be written as the tensor 
product of operators acting on each pair [to, n + m] of parti- 
cles 



2 -(n-fc) Pi 



PeV„ 



where Cl J = |^] m pill ®fim is an observable acting on the 
pair of particles For instance, for the Pauli operator ba- 

sis, Cl is the SWAP operator. Thus, the conditional probability 
is proportionnal to 



P 



a \rn>, n+ml 

n l 1 (20) 



m—k-\-l 



tr 




[m, n+m] 



(21) 



In the generic case of a state defined as a vector of the 
Hilbert space, computing the expectation value of a single lo- 
cal observable will take time 0(2 2n ) since we have to account 
for the Hilbert space of 2n qubits. A tensor product of local 
observables can be thought as the product of 0(n) observ- 
ables that act non-trivially on a few qubits. Thus, computing 
the expectation value given by equation ( f2T| will take time 
O (n2 2n ). In order to sample, such a computation has to be 
repeated for each particles, leading to an overall complexity of 
sampling from the relevance distribution of O (n 2 2 2 ™) in the 
worst case. Learning algorithms based on compressed sensing 
can recover low -rank density matrices from O (n2 n ) expec- 
tation values in any basis lfT2l . which indicates that it may 
be possible to improve the performance of the algorithm pro- 
posed here in the case of general pure states. 



Lieb-Robinson bound 



which is the expectation value of a tensor product of 2-local 
observables on the state p (g> p on 2n particles. 



Bound on the complexity of sampling 

The problem of sampling reduces to, for each of the n 
particles, i) computing conditional probabilities for each of 
the possible observables acting on that particle ii) pick one 
of those observables by generating a random number. Con- 
ditional probabilities can be expressed as expectation values 
through eq. ( pTj l. Thus, if computing expectation values on 
tensor product of local observables on states of n particles has 
complexity q(n), generating an index i = i\ . . . i n from the 
relevance distribution Pr(i) has complexity at most n x q(2n). 

For many states of interest, computing expectation values 
of local observables can be performed in polynomial time, i.e., 
q(n) E poly(n). That is the case for many families of tensor- 
network states such as matrix product states (MPS) [41] which 
are known to represent faithfully ground states of interesting 
many-body Hamiltonians in ID fl5l . In fact, the procedure 
outlined above can be simplified in the case of MPS, yielding 
a sampling complexity linear in n, see Fig. [2] Their natural 
extension to 2D, projected entangled pair states (PEPS) 11421 
also allows the efficient heuristic computation of such expec- 
tation values. 

A larger class of multi-qubit states for which sampling can 
be done efficiently by computing conditional probabilities are 
computationally tractable (CT) states [43 1. CT states are states 
in which (a) the overlap with any element of the computa- 
tional basis can be computed efficiently, and (b) it is possible 
to sample from the distribution of outcomes from measure- 
ments in the computational basis efficiently. For such states, 
it is possible to efficiently compute the expectation value of 
tensor products of Pauli observables which only permute ele- 
ments of the computational basis and thus are basis preserv- 
ing. 



The characterization of local Hamiltonians and Lindbla- 
dians relies heavily on the Lieb-Robinson bound l32l l44ll 
that shows that a local Hamiltonian generates a causal evo- 
lution, with effects propagating at a finite velocity v (note 
that this bound has been generalized to the setting of dissi- 
pative systems B31 . so our derivation holds for local Lind- 
bladians as well). A local Hamiltonians acting on n parti- 
cles is of the form H = J2x Hx where X labels subsets of 
n particles, each term has bounded norm < E, and 

acts on at most k neighboring particles, such that Hx = 
when \X\ > k. The evolution of an operator is governed 
by the equation %A{t) — i[H, A]. Break the Hamiltonian 




FIG. 2. Tensor network corresponding to eq. d2Tb if p — \ip){ift\ 
is a MPS, i.e., there exist a familty of matrices | j such that 

= . . . A^j\ii . . . in). The upper figure represent the indi- 
vidual tensors in the tensor network. Each square represent a tensor 
and outgoing legs represent the tensor indices. Two squares con- 
nected by a line are the contraction of the corresponding indices of 
two tensors. Red squares correspond to the operators. Orange 
squares correspond to the Pauli operators already chosen on the k—1 
previous qubits. Blue squares are the MPS tensors of the two copies 
of while the green squares are the MPS tensors of the two copies 
of The lower figure correspond to the partial contraction of the 
tensor network. 
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FIG. 3. When the system evolves under a local Hamiltonian (or 
Lindbladian), the operator A evolves under the full Hamiltonian H 
for a time t is essentially the same as the operator resulting from 
the evolution generated by the Hamiltonian truncated to the region 



R. Mathematically, e Ae~ 



lRt Ae 



R with correc- 



tions that decay exponentially with d, the radius of the region 7?. In 
the figure, the region M represents a membrane of constant thickness 
surrounding the region 7?. 



into H = Hq + Hm, where Hm contains all the terms Hx 
that intersect a membrane M surrounding the operator A (see 
Fig. |3J. The idea of this membrane is to disconnect its inte- 
rior, denoted region R, from the rest of the particles. Indeed, 
e iH t£ e -iB t _ e tH R t^ e -iH R t where Hr in ^ Hamilto- 
nian acting only inside the membrane (see Fig. [3}. The differ- 
ential equation for A(t) is 



d_ 
Of 



A(t) = i[H ,A(t)]+i[H M ,A], 



(22) 



which has solution 

A(t) =e lfiat A(0)e- t{lot 
ft 



i\ e lHM ^- s \H M ,A{s)]e- lHM{t - s) ds (23) 

o 

--e- lRt A(0)e 
la 



iH M (t-s) 



[H M ,A(s)]e- iAM ^- s) ds (24) 



as can be verified directly by differentiation. The commutator 
appearing in the second term can be bounded by 



\[H M ,A(s)]\\ <cV\\A\\\\H M \\exp 



vt 



(25) 



where V is the number of sites in the support of the observable 
A, and c, v, and £ are constant that depend only on the micro- 
scopic details of the system, independent of the system size. 
This is known as the the Lieb-Robinson bound. Integrating, 
we obtain 



-ill jit I 



\\A(t) - e ltlRt A{Q)e 

< c ty||i||||i? M ||||cxp( -- 



(26) 



d — vt 



Expanding the exponential to first order yields 

\\A(t)-A(0)-it[H R ,A(0)]\\ 
<ctV\\A\\\\H M \\\\exp' "'~ r/ 

+ c'\\A\\\\h r \\H 2 . 



(27) 





FIG. 4. Error in estimates of the parameters of local Hamiltonians. 
The systems consist of linear chains of qubits with randomly cho- 
sen 2-local Hamiltonians H — each coefficient has norm uniformly 
distributed between 0.8 and 1.2. Starting in an initial product state, 
the system is evolved for t = 10 -3 , and the expectation of randomly 
chosen observables is measured with precision e. The resulting linear 
constraints Eq. {30) are solved using Moore-Penrose pseudo-inverse 
to obtain an estimated Hamiltonian H. (Top) Distribution of the er- 
i 



i Jtr{H — H) 2 over different realization of the random Hamil- 
tonian for e — 10~ 4 . The red dots correspond to the mean distance 
and the solid lines is a linear fit. (Bottom) Distribution of error scal- 
ing factors — i.e. the factor by which the measurement accuracy e is 
amplified when computing the pseudo-inverse. The red dots indicate 
the average error scaling factor for each chain length (the red line is 
a quadratic fit). 
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Because Hr and Hm represent respectively the Hamiltonian 
of a ball of radius d and the Hamiltonian for a constant thick- 
ness membrane around that ball, they grow proportionally to 
d D and d 0-1 respectively, where D is the spatial dimension, 
i.e., \\Hr\\ < ad D and H-HmII < ad D_1 for some constant 
a. Choosing d « vt + \og(cV/c't) such that 



we obtain 

||i(t)-i(0)-it[F fl ,i(0)]||<K||i|| 



vt + \o g [- 



(29) 



for some constant n = 2c! a 2 . 

For a short time t, the expectation value of any observable 
A evolves as 



(A(t)) p - trip = it([H, A])p + 0(\\Hft 2 ). 



(30) 



By experimentally measuring this expectation value, we ob- 
tain one linear constraint on the Hamiltonian. Varying over 
different observables Ai and initial states pj, we obtain more 



(Mt))t> 



linear constraints that we can write as Wi- 
tr AiPj — it([H, AA)p. where we have dropped the higher 
order terms 0(\\H \\ 2 t 2 ). Writing H in an operator basis 
H = J^i hi Pi, we obtain the linear equation 



Wi 



(31) 



where 7y ; = it tr f>j[Pi, Ai\. The Hamiltonian can be 
learned by inverting this linear equation OTI . 

There are in general four important caveats to this ap- 
proach: 1) the evolution time t must be extremely short 
t <C going to as the number of particles grows; 2) 

there are exponentially many hi to learn; 3) there are exponen- 
tially many observables Ak and initial states pj to be measured 
and prepared experimentally; and 4) the quantities ti Ap and 
([H, A]) p can be exponentially difficult to compute. Based on 
Eq. ( f30] >, all these problems disappear when the Hamiltonian 
is local as described in the main text. 

Numerical experiments were performed for local Hamilto- 
nians, and the results are plotted in Fig. |4] The systems we 
considered were small chains of qubits with random nearest 
neighbour interactions. The system evolution was calculated 
exactly for a short amount of time, and the linearized problem 
was inverted using the Moore-Penrose pseudoinverse. Since 
these Hamiltonians are drawn at random (but with maximum 
strength for each term independent of the system size), we cal- 
culate the average I2 distance between the estimated Hamilto- 
nian and the actual Hamiltonian (top of Fig. |4j, as well as the 
quantiles for error propagation scaling factor of each of the 
elements of hi, given 

by Eij \ T t,f ( bottom of Fig. |4|). The 
results clearly indicate well behaved error scaling for these 



systems, even under finite statistical error in the estimation of 
observable expectations. 
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